Journal: Cell Genomics
Article Title: Benchmarking challenging small variants with linked and long reads
doi: 10.1016/j.xgen.2022.100128
Figure Legend Snippet: Medically relevant genes in the KIR locus, such as KIR2DL1, were partially included in v.3.3.2 with many erroneous variants but are correctly excluded by v.4.2.1 because of a likely duplication and other structural variation. Thick blue bars indicate regions included by each benchmark, and orange and light blue lines indicate positions of homozygous and heterozygous benchmark variants, respectively. A duplication of part of this region, which is common in the population, is supported by higher-than-normal coverage and high variant density across all technologies as well as alignment of multiple contigs from the maternal trio-based HG002 Hifiasm assembly (Hifiasm-maternal). The region is very challenging to characterize and assemble accurately because of high variability and copy number polymorphisms in the population as well as segmental duplications (shaded regions).
Article Snippet: DNA extracted from a single large batch of cells for 5 of the 7 genomes (HG001 to HG005) is publicly available in National Institute of Standards and Technology Reference Materials 8391 (HG002), 8392 (HG002-HG004), 8393 (HG005), and 8398 (HG001), available at https://www.nist.gov/srm .
Techniques: Variant Assay